CrowdDB: Query Processing with the VLDB Crowd

Authors

  • Amber Feng
  • Michael J. Franklin
  • Donald Kossmann
  • Tim Kraska
  • Samuel Madden
  • Sukriti Ramesh
  • Andrew Wang
  • Reynold Xin
Abstract

Databases often give incorrect answers when data are missing or semantic understanding of the data is required. Processing such queries requires human input for providing the missing information, for performing computationally difficult functions, and for matching, ranking, or aggregating results based on fuzzy criteria. In this demo we present CrowdDB, a hybrid database system that automatically uses crowdsourcing to integrate human input for processing queries that a normal database system cannot answer. CrowdDB uses SQL both as a language to ask complex queries and as a way to model data stored electronically and provided by human input. Furthermore, queries are automatically compiled and optimized. Special operators provide user interfaces in order to integrate and cleanse human input. Currently CrowdDB supports two crowdsourcing platforms: Amazon Mechanical Turk and our own mobile phone platform. During the demo, the mobile platform will allow the VLDB crowd to participate as workers and help answer otherwise impossible queries.
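The abstract mentions operators that integrate and cleanse human input but does not say how they work. As a rough illustration only, the sketch below assumes one common cleansing strategy, majority voting over normalized worker answers, to fill a missing column value. The names `normalize`, `majority_answer`, `fill_missing`, and `ask_crowd` are hypothetical and are not taken from CrowdDB.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]


def normalize(answer: str) -> str:
    """Collapse whitespace and lowercase so trivially different answers match."""
    return " ".join(answer.split()).lower()


def majority_answer(answers: List[str], min_agreement: float = 0.5) -> Optional[str]:
    """Return the most common normalized answer if strictly more than
    min_agreement of the non-empty answers agree on it; otherwise None."""
    cleaned = [normalize(a) for a in answers if a.strip()]
    if not cleaned:
        return None
    value, count = Counter(cleaned).most_common(1)[0]
    return value if count / len(cleaned) > min_agreement else None


def fill_missing(rows: List[Row], column: str,
                 ask_crowd: Callable[[Row], List[str]]) -> List[Row]:
    """Fill None values in `column` with the majority crowd answer.

    `ask_crowd` is a hypothetical callback standing in for a crowdsourcing
    platform; it returns the raw answers collected for one row.
    """
    filled = []
    for row in rows:
        if row.get(column) is None:
            # Copy the row so the caller's data is not mutated.
            row = {**row, column: majority_answer(ask_crowd(row))}
        filled.append(row)
    return filled
```

A value stays `None` when workers do not reach the agreement threshold, which leaves room for asking more workers instead of storing a doubtful answer.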

Similar resources

Master's Thesis Nr. 2 Crowddb – Answering Queries with Crowdsourcing

Despite the advances in the areas of databases and information retrieval, there still remain certain types of queries that are difficult to answer using machines alone. Such queries require human interaction to either provide data that is not readily available to machines or to gain more information from existing electronic data. CrowdDB is a database system that enables difficult queries to be...

Using the Crowd to Solve Database Problems

1. ABSTRACT OF THE INVITED TALK Database systems have been quite successful over the last decades for a wide range of applications. One of the strengths of the current generation of database products is their precise, formalized semantics based on the relational data model (i.e., SQL) and a closed world assumption. These properties allow databases to scale well and to perform a number of optim...

December: A Declarative Tool for Crowd Member Selection

Adequate crowd selection is an important factor in the success of crowdsourcing platforms, increasing the quality and relevance of crowd answers and their performance in different tasks. The optimal crowd selection can greatly vary depending on properties of the crowd and of the task. To this end, we present December, a declarative platform with novel capabilities for flexible crowd selection. ...

Argonaut: Macrotask Crowdsourcing for Complex Data Processing

Crowdsourced workflows are used in research and industry to solve a variety of tasks. The databases community has used crowd workers in query operators/optimization and for tasks such as entity resolution. Such research utilizes microtasks where crowd workers are asked to answer simple yes/no or multiple choice questions with little training. Typically, microtasks are used with voting algorithm...

Approximate Query Processing: Taming the TeraBytes

Garofalakis & Gibbons, VLDB 2001. Outline: Intro & Approximate Query Answering Overview: Synopses, System architecture, Commercial offerings; One-Dimensional Synopses: Histograms, Samples, Wavelets; Multi-Dimensional Synopses and Joins: Multi-D Histograms, Join synopses, Wavelets; Set-Valued Queries: Using Histograms, Samples, Wavelets; Advanced Techniques & Future Directions: Stre...

Journal title:
  • PVLDB

Volume 4  Issue 

Pages  -

Publication date 2011